Plaque Begets Plaque, ApoB Does Not
The Statistics Cause Doubt
John Slough
Plaque Begets Plaque, ApoB Does Not — Soto-Mota et al., 2025
Trial ID: NCT05733325
Design: 1-year prospective cohort using coronary CT angiography (CCTA)
Participants: 100 metabolically healthy adults meeting lean mass hyper-responder (LMHR) criteria
Plaque begets plaque: a mathematical artifact and clinically uninformative?
“All baseline plaque metrics (coronary artery calcium, NCPV, total plaque score, and percent atheroma volume) were strongly associated with the change in NCPV.”
“Many authors and pharmaceutical clinical trialists make the mistake of analyzing change from baseline instead of making the raw follow-up measurements the primary outcomes, covariate-adjusted for baseline.” - Frank E Harrell Jr, PhD, Professor of Biostatistics, Vanderbilt University School of Medicine
Change in noncalcified plaque volume \(\Delta \text{NCPV}\) was the outcome:
\[ \Delta \text{NCPV} = \text{NCPV}_{1} - \text{NCPV}_0 \]
They regressed \(\Delta \text{NCPV}\) directly on baseline values, for example:
\[ \Delta \text{NCPV} = \alpha + \beta \, \text{NCPV}_0 + \varepsilon \]
Change scores (ΔNCPV) can obscure clinically meaningful patterns
ΔNCPV is the same — but is that meaningful?
| Subject | ApoB | ΔNCPV |
|---|---|---|
| A | 120 | 20 |
| B | 185 | 20 |
| C | 250 | 20 |
Adding the baseline and follow-up values shows what identical change scores can hide:
| Subject | ApoB | NCPV₀ | NCPV₁ | ΔNCPV |
|---|---|---|---|---|
| A | 120 | 20 | 40 | 20 |
| B | 185 | 76 | 96 | 20 |
| C | 250 | 130 | 150 | 20 |
Change scores combine measurement error from both baseline and follow-up, inflating the model's residual variance and weakening the precision of the effect estimate.
| Model | Residual SE | SE (ApoB) | p-value (ApoB) | R² |
|---|---|---|---|---|
| ΔNCPV ~ ApoB | 34.42 | 0.065 | 0.3300 | 0.01 |
| NCPV1 ~ NCPV0 + ApoB | 15.20 | 0.029 | 0.0052 | 0.92 |
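The pattern in the table can be reproduced qualitatively with a small simulation. This is only a sketch on made-up data, not the study data: the variable names, the effect sizes, and the assumption that follow-up depends on baseline with a slope below 1 are all illustrative choices. Under that assumption, the change-score model pushes the unexplained baseline dependence into the residuals, while the baseline-adjusted model absorbs it.

```r
set.seed(1)
n <- 100

# Hypothetical data-generating process (NOT the study data).
apob  <- rnorm(n, mean = 185, sd = 40)
ncpv0 <- runif(n, min = 0, max = 150)
# Assumed: follow-up tracks baseline with slope 0.6 plus a small ApoB effect.
ncpv1 <- 10 + 0.6 * ncpv0 + 0.1 * apob + rnorm(n, sd = 10)
delta <- ncpv1 - ncpv0

fit_change <- lm(delta ~ apob)           # change-score model
fit_ancova <- lm(ncpv1 ~ ncpv0 + apob)   # follow-up adjusted for baseline

comparison <- data.frame(
  model       = c("delta ~ apob", "ncpv1 ~ ncpv0 + apob"),
  residual_se = c(sigma(fit_change), sigma(fit_ancova)),
  se_apob     = c(summary(fit_change)$coefficients["apob", "Std. Error"],
                  summary(fit_ancova)$coefficients["apob", "Std. Error"])
)
print(comparison)
```

How large the gap is depends on the assumed baseline slope and noise level; if follow-up moved one-for-one with baseline, the two models would be much closer in precision.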
Mathematical Coupling
They regressed \(\Delta \text{NCPV}\) directly on its baseline value \(\text{NCPV}_0\):
\[ \Delta \text{NCPV} = \alpha + \beta \, \text{NCPV}_0 + \varepsilon \]
But this introduces mathematical coupling, because \(\text{NCPV}_0\) appears on both sides of the equation:
\[ \text{NCPV}_{1} - \text{NCPV}_0 = \alpha + \beta \, \text{NCPV}_0 + \varepsilon \]
Regression model: \(\text{NCPV}_{1} - \text{NCPV}_0 = \alpha + \beta \, \text{NCPV}_0 + \varepsilon\)
\(\beta = \frac{\operatorname{Cov}(\Delta \text{NCPV},\ \text{NCPV}_0)}{\operatorname{Var}(\text{NCPV}_0)} = \frac{\operatorname{Cov}(\text{NCPV}_1 - \text{NCPV}_0,\ \text{NCPV}_0)}{\operatorname{Var}(\text{NCPV}_0)}\)
\(= \frac{\operatorname{Cov}(\text{NCPV}_1,\ \text{NCPV}_0) - \operatorname{Var}(\text{NCPV}_0)}{\operatorname{Var}(\text{NCPV}_0)} = \frac{\rho\, \sigma_1 \sigma_0 - \sigma_0^2}{\sigma_0^2}\)
\(= \frac{\rho\, \sigma_1 - \sigma_0}{\sigma_0} = \rho \cdot \frac{\sigma_1}{\sigma_0} - 1\)
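where \(\rho\) is the correlation between \(\text{NCPV}_0\) and \(\text{NCPV}_1\), and \(\sigma_0\) and \(\sigma_1\) are their standard deviations.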
With baseline NCPV contributing to both predictor and outcome, the slope reflects a mix of true correlation (\(\rho\)), variability ratio (\(\sigma_1/\sigma_0\)), and structural bias from \(\text{NCPV}_0\) appearing on both sides of the equation.
It is not a clean estimate of baseline influence.
Following Oldham* (1962): \(\beta > 0 \quad\text{if}\quad \rho > \frac{\sigma_0}{\sigma_1}\)
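The slope depends on the baseline–follow-up correlation \(\rho\) and the ratio of standard deviations \(\sigma_1/\sigma_0\).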
Because coupling alone pushes \(\beta\) downward by 1, a positive \(\beta\) is possible only when \(\rho \cdot \frac{\sigma_1}{\sigma_0} > 1\). That inequality can be satisfied without any causal baseline effect, for example, if \(\sigma_1\) exceeds \(\sigma_0\) because of simple growth, or if measurement reliability inflates \(\rho\). A positive slope therefore does not by itself demonstrate a biological baseline influence; it merely tells us that the positive \(\rho \cdot \frac{\sigma_1}{\sigma_0}\) component has outweighed the −1 artifact.
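A few plugged-in values make the inequality concrete. The numbers are purely illustrative and are not estimates from the study; they only apply \(\beta = \rho \cdot \frac{\sigma_1}{\sigma_0} - 1\) as derived above.

\[ \rho = 0.95,\ \frac{\sigma_1}{\sigma_0} = 1.2 \;\Rightarrow\; \beta = 0.95 \times 1.2 - 1 = 0.14 \]

\[ \rho = 0.95,\ \frac{\sigma_1}{\sigma_0} = 1.0 \;\Rightarrow\; \beta = 0.95 \times 1.0 - 1 = -0.05 \]

\[ \rho = 0,\ \frac{\sigma_1}{\sigma_0} = 1.0 \;\Rightarrow\; \beta = 0 - 1 = -1 \]

The same baseline–follow-up correlation gives a positive or a negative slope depending only on whether the spread of values grew between scans.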
In this study:
A positive slope in a mathematically coupled regression signals only that baseline–follow-up correlation and/or variance growth were strong enough to offset the unavoidable −1 artifact. Whether it reflects true biology, and whether it exaggerates or understates that biology, cannot be determined from \(\beta\) alone.
In other words, we cannot just add 1 to the resulting beta. It is more complicated than that.
*Source: Oldham, 1962, J. Chronic Dis.
```r
n <- 100                                    # number of simulated subjects
baseline <- rnorm(n, mean = 100, sd = 10)   # n random baseline values centered on 100
follow_up <- rnorm(n, mean = 120, sd = 10)  # n independent follow-up values centered on 120
delta <- follow_up - baseline               # change score: follow-up minus baseline
```

Figure panels: “No true association” and “Association due to mathematical coupling”.
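The simulation above can be extended to put numbers on the two panels. The sketch below uses a fixed seed and a larger sample than the original (both are my choices, made only so the pattern is stable) and checks that the fitted slope matches \(\rho \cdot \frac{\sigma_1}{\sigma_0} - 1\) when computed from the sample quantities.

```r
set.seed(42)
n <- 1000  # larger than n = 100 above, chosen here only for stability

baseline  <- rnorm(n, mean = 100, sd = 10)
follow_up <- rnorm(n, mean = 120, sd = 10)  # independent of baseline by construction
delta     <- follow_up - baseline

cor_raw   <- cor(baseline, follow_up)  # close to 0: no true association
cor_delta <- cor(baseline, delta)      # strongly negative: coupling alone

# Slope from regressing the change score on baseline.
slope <- unname(coef(lm(delta ~ baseline))["baseline"])

# Same slope from the closed-form expression, using sample estimates.
slope_formula <- cor_raw * sd(follow_up) / sd(baseline) - 1

print(c(cor_raw = cor_raw, cor_delta = cor_delta,
        slope = slope, slope_formula = slope_formula))
```

With two unrelated series of equal variance, the change-versus-baseline slope sits near −1 and the correlation near \(-1/\sqrt{2}\), even though baseline tells us nothing about follow-up.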
An alternative is to model NCPV at follow-up (\(\text{NCPV}_1\)) directly while adjusting for baseline NCPV. This example uses ApoB as the independent variable:
\[ \text{NCPV}_1 = \alpha + \beta_1 \text{NCPV}_0 + \beta_2 \text{ApoB} + \varepsilon \]
This approach avoids mathematical coupling, reduces residual variance, and allows the coefficient on \(ApoB\) to reflect biological association — not algebraic structure.
This is an example of an ANCOVA-style model, where the follow-up outcome is regressed on both baseline and the predictor of interest.
ANCOVA is preferred over change score analysis because it adjusts for baseline without inducing mathematical coupling and typically improves statistical power and interpretability.
A slope from ΔNCPV ∼ NCPV₀ may reflect math, not biology, because the outcome is built from the predictor. Using change scores also compounds error and weakens model precision. To isolate real effects and preserve clinical meaning, we should model follow-up directly while adjusting for baseline.
“Statisticians have repeatedly warned against correlating/regressing change with baseline due to two methodological concerns known as mathematical coupling and regression to the mean.” - Tu & Gilthorpe, 2007
“Many authors and pharmaceutical clinical trialists make the mistake of analyzing change from baseline instead of making the raw follow-up measurements the primary outcomes, covariate-adjusted for baseline.” - Frank E Harrell Jr, PhD, Professor of Biostatistics
Plaque begets plaque: a mathematical artifact and clinically uninformative?
At best: hypothesis-generating.
At worst: misread as proof.
“Linear models on the primary (NCPV) and secondary outcomes were univariable”
Despite having multiple predictors available (age, sex, ApoB, BMI, Triglycerides, Systolic blood pressure, CAC, NCPV₀, LDL-C exposure),
each was tested separately in single-predictor regressions.
This modeling choice leaves estimates vulnerable to omitted-variable bias:
“…omitting a relevant variable from a model which explains the independent and dependent variable leads to biased estimates.” - Wilms (2021)
A predictor may appear significant in a univariable model, but its effect can vanish once relevant covariates are included
This is very common, especially in observational studies, with small datasets and highly correlated predictors.
They modeled:
\[ \Delta \text{NCPV} = \alpha + \beta \text{ApoB} + \varepsilon \]
But if age also predicts ΔNCPV and correlates with ApoB, then \(\beta\) is biased and partly reflects the effect of age.
A more appropriate model would be:
\[ \Delta \text{NCPV} = \alpha + \beta_1 \text{ApoB} + \beta_2 \text{Age} + \varepsilon \]
This separates the contribution of ApoB from that of age.
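A minimal simulation shows how this bias can arise. Everything here is hypothetical: the data are made up, and the setup assumes that age drives both ApoB and ΔNCPV while ApoB itself has no direct effect. In real data the bias could run in either direction, including masking a true association.

```r
set.seed(7)
n <- 2000

# Hypothetical data (NOT the study data): age influences both variables.
age   <- rnorm(n, mean = 55, sd = 10)
apob  <- 100 + 2 * age + rnorm(n, sd = 20)
delta <- 0.5 * age + rnorm(n, sd = 10)   # assumed: no direct ApoB effect

# Univariable model: the ApoB coefficient absorbs part of the age effect.
beta_unadjusted <- unname(coef(lm(delta ~ apob))["apob"])

# Adjusted model: the ApoB coefficient is estimated holding age fixed.
beta_adjusted <- unname(coef(lm(delta ~ apob + age))["apob"])

print(c(unadjusted = beta_unadjusted, adjusted = beta_adjusted))
```

The unadjusted coefficient is clearly positive while the adjusted one is close to zero, which is the pattern the omitted-variable argument predicts under these assumptions.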
With only univariable linear models, omitted-variable bias and confounding are almost certain, especially in small, non-randomized human data. This undermines any claim of association or non-association.
“Estimated lifetime LDL-C exposure was only a significant predictor of final NCPV in the univariable analysis but lost significance when age was included as a covariate. Both age and lifetime LDL-C exposure lost significance when baseline CAC was included in the model.”
Additionally, Table 3 does include one multivariable regression on the primary outcome \(\Delta \text{NCPV}\):
So they did include three multivariable models: one on the main outcome \(\Delta \text{NCPV}\) and two in their Age Mediation Analysis on NCPV at follow-up.
None tested the full set of available covariates, nor explained why these specific models were chosen.
This selective modeling raises questions.
“Neither change in ApoB…baseline ApoB, nor total LDL-C exposure… were associated with the change in noncalcified plaque volume (NCPV) or TPS. All baseline plaque metrics (coronary artery calcium, NCPV, total plaque score, and percent atheroma volume) were strongly associated with the change in NCPV.”
Stating this in the abstract based solely on univariable regressions is a clear example of overinterpreting results drawn from limited statistical models.
At best, this reflects naïve reporting. At worst, it’s actively misleading.
Unusual. Fragile. Overstated.
Frequentist:
Assuming there is no true association between ApoB and ΔNCPV, how likely is it that we’d observe a slope as large (or larger) than the one we found, just by chance?
If the p-value exceeds 0.05 (p > 0.05), a frequentist analysis can say:
“We did not find sufficient evidence to reject the hypothesis that ApoB has no association with ΔNCPV”
It cannot say the null is likely true, or produce the probability that there is no association, just that the data were inconclusive.
Bayesian:
How well do the data fit under two competing models, one with no association (null), and one with a range of plausible effect sizes for ApoB (alternative)?
A Bayes factor (e.g., BF₀₁ = 6) allows a stronger statement:
“The observed data are 6 times more likely (moderate evidence) under the ‘no association’ model than under the alternative model that assumes some effect from ApoB (as defined by the prior).”
“Since lack of statistical significance (ie, P > 0.05) should not be interpreted as evidence in favor of the null but simply a failure to reject the null, the addition of Bayesian inference adds credence to finding that there is no association between NCPV vs LDL-C or ApoB…”
So, they turn to Bayesian inference to “support” their finding that ApoB has no association with plaque progression.
This is unusual in a non-randomized, uncontrolled, 1-year observational study on a highly restricted sample:
Despite the limited model and context, they interpret the Bayes factor as strongly confirmatory, which is uncommon in similar observational settings.
They are applying a more confirmatory-leaning statistical framework to an exploratory analysis that lacks adequate adjustment or robustness.
This is an ill-suited use of Bayesian inference.
Not because Bayesian methods are invalid.
Because they’re being used to amplify certainty or create unwarranted confidence from an analysis that lacks adjustment, control, or a design suitable for strong claims.
“Bayes factors were calculated using BayesFactor::regressionBF… and an ~ rscale value of 0.8 to contrast a moderately informative prior with a conservative distribution width (to allow for potential large effect sizes) due to the well-documented association between ApoB changes and coronary plaque changes”
Bayesian Prior: represents your belief about likely effect sizes before seeing the data.
From the BayesFactor documentation for the parameter rscaleCont:
“Several named values are recognized: ‘medium’, ‘wide’, and ‘ultrawide’, which correspond to rscales of √2/4, 1/2, and √2/2, respectively.”
An rscale of 0.8 is wider than “ultrawide” (√2/2 ≈ 0.707). It is not a “moderately informative” prior. It is a weakly informative or vague prior that places most of its weight on large effects.
A moderately informative prior would typically correspond to “medium” (≈ 0.354) or “wide” (0.5), which place more mass on smaller effects.
‘We believe ApoB has a strong effect on plaque progression within one year. If we don’t observe strong effects in that time frame, we’ll treat that as evidence that ApoB likely has no effect.’
The authors’ choice of rscale isn’t wrong, but their description of it is misleading.
Labeling an rscale = 0.8 prior as “moderately informative” or “conservative” downplays the fact that it assumes large effects, making small observed effects look unlikely under H₁ and inflating support for H₀.
Their choice of prior is subjective, influential, and not tested for robustness.
The model expected large ApoB effects, so small observed effects are treated as evidence for no effect.
Best practice is to run a sensitivity analysis, to see whether conclusions change with different priors.
| rscale | BF₁₀ | BF₀₁ |
|---|---|---|
| 0.100 | 0.530 | 1.889 |
| 0.250 | 0.288 | 3.473 |
| 0.350 | 0.217 | 4.608 |
| 0.500 | 0.157 | 6.363 |
| 0.707 | 0.113 | 8.834 |
| 0.800 | 0.100 | 9.954 |
| 1.000 | 0.081 | 12.374 |
This kind of rscale sensitivity analysis is standard for default Bayes factors, but it’s a limited diagnostic — it tests only prior width, not prior plausibility or model fit.
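A sensitivity table of this kind can be sketched as follows. The data are simulated and hypothetical, so the resulting values will not match the table above. The sketch uses `BayesFactor::lmBF()`, which compares one specified regression against the intercept-only model, with the same `rscaleCont` argument that `regressionBF()` takes. It runs only if the BayesFactor package is installed.

```r
set.seed(3)
n <- 100

# Hypothetical data (NOT the study data) with a weak ApoB effect.
sim <- data.frame(apob = rnorm(n, mean = 185, sd = 40))
sim$delta <- 20 + 0.02 * sim$apob + rnorm(n, sd = 30)

rscales <- c(0.1, 0.25, 0.35, 0.5, 0.707, 0.8, 1)

if (requireNamespace("BayesFactor", quietly = TRUE)) {
  # BF10: evidence for the ApoB model relative to the intercept-only model.
  bf10 <- sapply(rscales, function(r) {
    bf <- BayesFactor::lmBF(delta ~ apob, data = sim,
                            rscaleCont = r, progress = FALSE)
    BayesFactor::extractBF(bf)$bf
  })
  sensitivity <- data.frame(rscale = rscales, BF10 = bf10, BF01 = 1 / bf10)
  print(sensitivity)
} else {
  message("BayesFactor is not installed; skipping the sensitivity table.")
}
```

Reporting the whole table lets a reader see how much of the support for the null comes from the data and how much from the chosen prior width.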
“In other words, these data suggest it is 6 to 10 times more likely that the hypothesis of no association between these variables (the null) is true as compared to the alternative.”
A Bayes factor of 6–10 means the data are 6–10× more likely under the null model than under the alternative model, not that the null hypothesis is 6–10× more likely to be true.
They could have said: “The data are 6–10 times more likely under the no-association model than under the alternative.”
Bayes factors update prior odds into posterior odds. Claiming the null is 6–10× more likely assumes equal prior odds, something the authors never stated.
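The step from Bayes factor to posterior probability can be written out in a few lines. The helper below is a hypothetical illustration (the function name and the example prior probabilities are mine); it only applies posterior odds = Bayes factor × prior odds.

```r
# Posterior probability of the null given BF01 and a prior probability of the null.
posterior_prob_null <- function(bf01, prior_prob_null) {
  prior_odds <- prior_prob_null / (1 - prior_prob_null)
  posterior_odds <- bf01 * prior_odds
  posterior_odds / (1 + posterior_odds)
}

# Equal prior odds: the posterior odds equal the Bayes factor itself.
p_equal <- posterior_prob_null(bf01 = 6, prior_prob_null = 0.5)

# A reader who gives the null only a 20% prior probability.
p_skeptic <- posterior_prob_null(bf01 = 6, prior_prob_null = 0.2)

print(c(equal_prior = p_equal, skeptical_prior = p_skeptic))
```

The same Bayes factor of 6 leaves the null at roughly 86% under equal prior odds but only 60% for a reader who starts out doubting it, which is why the prior odds need to be stated.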
At least 14 distinct linear regressions were reported, with likely many more from exploratory models referenced in figures and supplements.
Numerous tests increase false positive risk, yet no multiple testing correction was applied.
Such as:
Perhaps the authors viewed this analysis as exploratory, where correction is often skipped —
but then why title the paper “Plaque Begets Plaque, ApoB Does Not”?
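To give a sense of scale for the multiple-testing concern, the sketch below computes the chance of at least one false positive across 14 tests and applies a Holm correction. The independence assumption and the p-values are hypothetical, chosen only for illustration; they are not taken from the paper.

```r
n_tests <- 14
alpha   <- 0.05

# Family-wise error rate if all 14 nulls are true and the tests are independent
# (an assumption; correlated tests would give a different figure).
fwer <- 1 - (1 - alpha)^n_tests

# Hypothetical raw p-values, NOT from the study.
p_raw  <- c(0.004, 0.03, 0.04, 0.20, 0.45, 0.60, 0.85)
p_holm <- p.adjust(p_raw, method = "holm")

print(round(fwer, 3))
print(data.frame(p_raw = p_raw, p_holm = p_holm))
```

Under these assumptions the family-wise error rate is about 51%, and two of the three nominally significant hypothetical results no longer pass after correction.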
Baseline NCPV median = 44 mm³; TPS median = 0 → ≥50% of values are zero
CCTA cannot report negative plaque → both outcomes are left-censored at 0
This affects not just modeling but measurement:
When true plaque ≈ 0, error is asymmetric — it can only overestimate.
ΔNCPV, their primary outcome, is a change score between two bounded, skewed measures.
Likely to produce non-normal residuals and heteroscedasticity (e.g., larger spread at higher baseline).
If smaller baseline values were also linked to larger increases,
this may reflect the effects of left-censoring and error asymmetry — not true biological acceleration.
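The asymmetry near zero can be illustrated with a short simulation. It rests on assumptions that are mine, not the paper's: symmetric measurement error with a fixed standard deviation, and any reading below zero recorded as zero. The plaque volumes are hypothetical.

```r
set.seed(11)
n        <- 5000
error_sd <- 10   # assumed measurement error SD (hypothetical, in mm^3)

# Measured value = true value + symmetric error, floored at zero.
measure <- function(true_value) {
  pmax(true_value + rnorm(length(true_value), sd = error_sd), 0)
}

true_low  <- rep(2, n)     # true plaque close to the floor
true_high <- rep(100, n)   # true plaque far from the floor

bias_low  <- mean(measure(true_low)  - true_low)
bias_high <- mean(measure(true_high) - true_high)

print(c(bias_near_floor = bias_low, bias_far_from_floor = bias_high))
```

Readings near the floor are biased upward on average, while readings far from it are not, so low-baseline measurements carry a different error structure than high-baseline ones.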
These issues are clear in TPS, and may affect NCPV, but diagnostics are not shown.
OLS assumes homoscedastic, normal residuals
performance::check_model() was run — but no output provided
They could have considered methods to address this such as: Tobit regression, log(+1) transform, or robust regression
Study abstract: “Plaque progression predictors were assessed with linear regression and Bayes factors. Diet adherence and baseline cardiovascular disease risk sensitivity analyses were performed.”
“Sensitivity analyses on participants with >80% of bHB measurements above 0.3 mmol/L (Supplemental Tables 2 to 4) and with high calculated 10-year cardiovascular risk showed similar results to those just reported.”
The authors conducted apparently post hoc “sensitivity analyses” in two subgroups:
Without knowing if the subgroups are distinct, it’s hard to tell whether these findings confirm one another or simply re-analyze overlapping data — replication vs. redundancy.
The same individuals may be contributing to both sets of sensitivity analyses.
This limits interpretability and weakens claims of consistency across groups.
Exact same modeling strategy:
Results: “Sensitivity analyses… showed similar results to those just reported”.
| Model | Table 3 (Full, n = 100) | Supplemental Table 4 (High Adherence, n = 56) | Supplemental Table 5 (High CVD Risk, n = 28) |
|---|---|---|---|
| ΔNCPV ~ ΔApoB | β = 0.01, P = 0.91, BF > 10.0 | β = 0.04, P = 0.63, BF = 6.90 | β = 0.10, P = 0.57, BF = 4.76 |
| ΔNCPV ~ ApoB₀ | β = 0.06, P = 0.33, BF = 6.3 | β = 0.06, P = 0.09, BF = 1.83 | β = 0.11, P = 0.52, BF = 4.57 |
Compared to the full sample, the high adherence group offers only anecdotal evidence for the null.
The authors state that sensitivity analyses showed “similar results,” but provide no statistical comparisons, and no clarity on overlap between subgroups.
“It should be emphasized that this includes heterogeneity in progression (and regression) across the population.” - Keto-CTA study
The authors invoke heterogeneity to downplay the pooled NCPV change, yet they ran univariable regressions and interpreted Bayes factors on that same pooled group.
If a group is too heterogeneous to report pooled outcomes, it is also too heterogeneous for pooled inferences about predictors or mechanisms.
If their CVD risk (e.g., plaque progression) is not coherent, then the category fails as a predictive or explanatory label.
All clinical populations are heterogeneous in risk.
“p50=18.8 mm3 IQR(37.3).”
With a median of 18.8 mm³ and only 1–2 individuals showing regression, the IQR of 37.3 mm³ must be mostly skewed upward, not balanced.
A wide IQR here reflects high inter-individual variability in how much NCPV progressed among the LMHRs, since all but a few individuals had more plaque at follow-up.
In other words: With nearly all LMHRs showing plaque progression, a wide IQR (37.3 mm³) doesn’t indicate balanced variability—it reflects differing degrees of worsening.
This matters because:
Letter to the Editor - External researchers raise concerns about the study’s methodology
Response to the Letter - Study authors respond to concerns
From the response:
“Regarding the analytical points brought forward, we are aware of the relevance of linear assumptions to obtain accurate estimators. Since residual plot evaluation can also be subjective…”
Objective, quantitative statistical tests also exist for assessing model assumptions:
The performance::check_model() function they cite automatically generates most of these checks.
It’s unclear why they wouldn’t include or reference the output.
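As an illustration of what quantitative checks look like, the sketch below runs two of them in base R on simulated data whose spread grows with the predictor. The data are hypothetical, and the heteroscedasticity check is a hand-rolled version of the studentized (Koenker) Breusch-Pagan test, written out here to avoid depending on extra packages; it is not the output the authors would have obtained.

```r
set.seed(5)
n <- 300

# Hypothetical data (NOT the study data): residual spread increases with x.
x <- runif(n, min = 0, max = 100)
y <- 5 + 0.5 * x + rnorm(n, sd = 1 + 0.3 * x)
fit <- lm(y ~ x)

# Normality of residuals: Shapiro-Wilk test.
normality <- shapiro.test(residuals(fit))

# Heteroscedasticity: regress squared residuals on fitted values;
# n * R^2 is compared to a chi-squared distribution with 1 degree of freedom.
e2  <- residuals(fit)^2
fv  <- fitted(fit)
aux <- lm(e2 ~ fv)
bp_stat <- n * summary(aux)$r.squared
bp_p    <- pchisq(bp_stat, df = 1, lower.tail = FALSE)

print(c(shapiro_p = normality$p.value, bp_stat = bp_stat, bp_p = bp_p))
```

Tests like these do not replace residual plots, but they give a reader something concrete to evaluate when the plots themselves are not shown.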
“…we followed their suggestion and re-ran all models with robust linear regression…as expected, there were small differences with the published estimates, all models using robust regression were consistent with what was reported.”
No output, diagnostics, or model fit provided. We are asked to trust their assertion.
“Additionally, we’ll highlight that the mean change in NCPV was 31.5 mm3 with a standard deviation also of 31.5 mm3. This highlights the impressive degree of heterogeneity in NCPV change.”
The SD of ΔNCPV (31.5 mm³), equal to its mean, was omitted from the original paper and reported in response to critique.
That high level of variation (“heterogeneity”) weakens power and makes all regression results less reliable — not just the null finding on ApoB.
“We completely agree our exploratory results (especially those suggesting a null association between plaque progression and ApoB) should be interpreted with caution”
“Moreover, our results are compatible with a causal role of ApoB in atherosclerosis, as we have openly acknowledged and supported in previous publications.”
“Plaque Begets Plaque, ApoB Does Not”
“Along the same lines, we would like to clarify that our title was not meant to be a statement about causality. “Plaque begets plaque” (which, of course, mirrors the proverb “Money begets money”) is frequently used to highlight the strong and clinically relevant association of baseline plaque values with plaque progression rate [7].”
In that citation [7], Yoon et al. did perform t-tests on change scores, but their main conclusions relied on a random-effects repeated-measures multivariable regression model of follow-up CAC, adjusted for baseline plaque and clinical risk factors.
The paper did not report any univariable linear regression models.
“In retrospect, we might have chosen “Longitudinal Data from the KETO-CTA Study” as alternative phrasing to avoid misinterpretations.”
Plaque Begets Plaque, ApoB Does Not
“misinterpretations”